ABSTRACT


We present the results of a Monte Carlo study of the power
of a new multivariate goodness-of-fit test.
INTRODUCTION


Foutz [1] developed a new test of goodness-of-fit for multivariate
distributions. The test can also be used to fit univariate
distributions. In an earlier paper [2] the authors compared the
Foutz test with the Chi-square and Kolmogorov-Smirnov tests. The
results indicated that the Foutz test is more powerful in detecting
certain characteristics than the other two tests. This paper deals
with the performance of the test when fitting multivariate
distributions. More specifically, we investigate the power of the test
when fitting bivariate and trivariate normal distributions for
various choices of the mean vector and the covariance matrix. In
the second section we present a brief description of the Foutz
test; a discussion of the simulation procedure is in the third
section and the results of the simulation are in the final section.

THE FOUTZ TEST

Let $X_1, X_2, \ldots, X_{n-1}$ be a random sample of size n-1 from a
p-variate continuous distribution. The first step of the Foutz
test is to divide the sample space into n statistically equivalent
blocks $B_1, B_2, \ldots, B_n$ and then determine a "continuous empirical
distribution function (CEDF)" $H_n$. The test statistic for the
hypothesis that the true c.d.f. is H is

$$F_n = \sup_{\text{all events } S} \left| P_n(S) - P_H(S) \right|$$

where $P_n(S)$ and $P_H(S)$ are the empirical and hypothesized
probability measures of an event S, computed from $H_n$ and H respectively.

An equivalent computational formula for $F_n$ is

$$F_n = \sum_{i=1}^{n} \max\left[0, \frac{1}{n} - D_i\right]$$

where $D_i = P[X \in B_i \mid H]$.
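As a minimal sketch of the computational formula, assuming the null probabilities $D_i$ of the n blocks have already been computed, the statistic reduces to a sum of positive parts. The function name `foutz_statistic` is illustrative and does not come from the paper.

```python
def foutz_statistic(block_probs):
    """Return F_n = sum_i max(0, 1/n - D_i).

    block_probs holds D_i, the probability of block B_i under the
    hypothesized distribution H; n is the number of blocks.
    """
    n = len(block_probs)
    return sum(max(0.0, 1.0 / n - d) for d in block_probs)


# Blocks that each carry null probability 1/n give F_n = 0.
print(foutz_statistic([0.25, 0.25, 0.25, 0.25]))
# Two of four blocks with null probability 0 give 2 * (1/4) = 0.5.
print(foutz_statistic([0.5, 0.5, 0.0, 0.0]))
```

Each block receives empirical probability 1/n, so the supremum over events is attained by the union of the blocks whose null probability falls below 1/n, which is what the sum collects.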

A general procedure for the construction of statistically
equivalent blocks is the following. Select n-1 "cutting functions"
$h_k(X)$, k = 1, 2, ..., n-1, such that $h_k(X)$ has a continuous
distribution, and a permutation $k_1, k_2, \ldots, k_{n-1}$ of the
integers 1, 2, ..., n-1. Order the samples $X_i$ according to the value
of $h_{k_1}(X_i)$; let $X(k_1)$ be the vector associated with the
$k_1$th order statistic. Partition the sample space into two blocks
$B_1$ and $B_2$ defined by

$$B_1 = \{X : h_{k_1}(X) \le h_{k_1}[X(k_1)]\} \quad \text{and} \quad B_2 = B_1^c .$$

At the second step, if $k_2 < k_1$, order the $k_1$ sample vectors in $B_1$
according to $h_{k_2}(X)$ and let $X(k_2)$ be the $k_2$th order statistic.
Partition $B_1$ into two sub-blocks $B_{11}$ and $B_{12}$; at this stage the
sample space is partitioned into three blocks $B_{11}$, $B_{12}$ and
$B_{20} = B_2$. If $k_2 > k_1$, order the $n - 1 - k_1$ sample values in
the block $B_2$ according to $h_{k_2}(X)$. Let $X(k_2)$ be the
$(k_2 - k_1)$th order statistic and partition the block $B_2$ into two
sub-blocks $B_{21}$ and $B_{22}$ and take $B_{10} = B_1$. Continue the
process until all the cutting functions are exhausted; this results in a
partition of the sample space into n statistically equivalent blocks
$B_1, B_2, \ldots, B_n$. More details on the procedure and some examples
are available in [3].

The null c.d.f. of the test statistic $F_n$ (necessary to determine
the critical values) is quite difficult to derive even for
small n; for n = 3, 4, 5 formulas for the exact c.d.f. are in [2].
Foutz proposed a large sample normal approximation with mean
$\mu = e^{-1}$ and variance $\sigma^2 = (2e^{-1} - 5e^{-2})/n$. In our
earlier study [2] we found that with this approximation the observed
significance level is about 10-20% smaller than the nominal values for
sample size n - 1 = 20, 30, 50. We therefore proposed the use of
empirically generated (based on 80,000 simulated $F_n$ values) critical
points in Table I below.

TABLE I.  EMPIRICAL CRITICAL VALUES FOR FOUTZ TEST

                                Sample Size
Significance level        20          30          50

       .10              .42714      .41903      .40816
       .05              .44865      .43533      .42116
       .01              .48659      .46579      .44487
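
For comparison with the empirical values in Table I, the large sample normal approximation can be evaluated directly. The sketch below assumes the mean $e^{-1}$ and variance $(2e^{-1} - 5e^{-2})/n$ quoted above, with n taken as the number of blocks (sample size plus one); the function name is illustrative only.

```python
import math
from statistics import NormalDist


def foutz_normal_critical_value(n_blocks, alpha):
    """Upper-tail critical value of F_n under the normal approximation.

    n_blocks is n, the number of statistically equivalent blocks
    (sample size + 1); alpha is the nominal significance level.
    """
    mean = math.exp(-1)
    variance = (2 * math.exp(-1) - 5 * math.exp(-2)) / n_blocks
    return NormalDist(mean, math.sqrt(variance)).inv_cdf(1 - alpha)


# Sample sizes 20, 30, 50 correspond to n = 21, 31, 51 blocks.
for n in (21, 31, 51):
    print(n - 1, round(foutz_normal_critical_value(n, 0.05), 5))
```

For sample size 20 and the .05 level this approximation gives roughly .455, against the empirical value .44865 in Table I.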

We have also generated a variation of the large sample normal
approximation to calculate more accurate estimates of the critical
values. We are presently in the process of using this approximation
to generate tables for a spectrum of critical values and
sample sizes.

SIMULATION  PROCEDURE 

To investigate the power of the Foutz test, 5,000 replicates
each of samples of size 20 from several bivariate and trivariate
normal distributions were generated; in all cases the hypothesis
tested was that the samples are from a bivariate/trivariate normal
distribution with zero mean vector and identity covariance matrix.
The true values of the means, variances and covariances for
the generated samples were chosen so as to study the effect of
(i) shifts in the means only, (ii) shifts in the variances only,
(iii) shifts in the covariances only, and (iv) a few cases involving
a combination of all three.
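
One way to generate such samples is sketched below. It draws from a multivariate normal with a chosen mean vector and covariance matrix through a Cholesky factor; the paper does not say which generator was used, so this method and the helper names `cholesky` and `sample_mvn` are assumptions made for illustration.

```python
import math
import random


def cholesky(cov):
    """Lower-triangular L with L * L^T = cov (cov must be positive definite)."""
    p = len(cov)
    L = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(cov[i][i] - s)
            else:
                L[i][j] = (cov[i][j] - s) / L[j][j]
    return L


def sample_mvn(mean, cov, size, rng):
    """Draw `size` vectors from N(mean, cov) as mean + L z, z standard normal."""
    L = cholesky(cov)
    p = len(mean)
    draws = []
    for _ in range(size):
        z = [rng.gauss(0.0, 1.0) for _ in range(p)]
        draws.append([mean[i] + sum(L[i][k] * z[k] for k in range(i + 1))
                      for i in range(p)])
    return draws


# One replicate of size 20 with a shifted mean and correlated components.
rng = random.Random(12345)
replicate = sample_mvn([0.5, 0.0], [[1.0, 0.6], [0.6, 1.0]], 20, rng)
print(len(replicate), replicate[0])
```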

The method of blocking we implemented was as follows. We let
the samples themselves implicitly determine the permutation
$k_1, k_2, \ldots, k_{n-1}$ and the order of the cutting functions, which
were all taken to be coordinate functions, i.e., $h_k(X)$ = the ith
coordinate of the sample vector X, for some i. The following example
with p = 2 (bivariate samples) will illustrate the procedure. Suppose
$x_1, x_2, \ldots, x_{n-1}$ are the observed sample vectors. The first
cut on the p-dimensional sample space is made at $x_1^{(1)}$, the first
coordinate of the first sample vector $x_1$. This partitions the sample
space into two blocks $B_1$, $B_2$ defined by

              1st coordinate              2nd coordinate
    $B_1$     $(-\infty, x_1^{(1)}]$      $(-\infty, +\infty)$
    $B_2$     $(x_1^{(1)}, +\infty)$      $(-\infty, +\infty)$

The second sample vector $x_2$ will be in one of the two blocks
$B_1$ or $B_2$. Assume that it is in $B_2$; partition $B_2$ into two
sub-blocks

              1st coordinate              2nd coordinate
    $B_{21}$  $(x_1^{(1)}, +\infty)$      $(-\infty, x_2^{(2)}]$
    $B_{22}$  $(x_1^{(1)}, +\infty)$      $(x_2^{(2)}, +\infty)$

where $x_2^{(2)}$ is the value of the second coordinate of the sample
vector $x_2$. At this stage the sample space is partitioned into
three blocks $B_1$, $B_{21}$ and $B_{22}$. Continuing this process
(letting the cutting function at the qth stage be $x_q^{(r)}$, the rth
coordinate of $x_q$, where r = [(q-1) mod p] + 1) until all the sample
vectors are exhausted will result in a partition of the sample space
into n statistically equivalent blocks.

In our simulation we used both rectangular and polar/spherical
coordinates to examine if one scheme is more adept at detecting certain
types of violations of the null hypothesis. When using spherical
coordinates the first coordinate is taken to be the radius vector (same
for polar coordinates), the second coordinate is the angle in the
horizontal plane and the third coordinate is the angle measured from
the vertical axis. Figures 1 and 2 graphically demonstrate the
construction of the blocks for five bivariate samples using the
rectangular and polar coordinate systems. The implicit permutation and
the order of the coordinate cutting functions are also included in the
figures. It should be noted that for polar/spherical coordinates it is
necessary to make an additional initial cut, which we took at
$\theta = 0°$.
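
The rectangular-coordinate blocking described above can be sketched as follows. This is an illustrative reading of the procedure, not the authors' program: each block is stored as a list of half-open intervals (lo, hi], the qth sample vector splits the block it falls in along coordinate r = [(q-1) mod p] + 1, and the null probability of a block under a standard normal with identity covariance is a product of univariate normal probabilities. All function names are hypothetical.

```python
import math


def phi(x):
    """Standard normal c.d.f.; handles -inf and +inf."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def rectangular_blocks(samples):
    """Split the sample space into len(samples) + 1 rectangular blocks."""
    p = len(samples[0])
    blocks = [[(-math.inf, math.inf)] * p]
    for q, x in enumerate(samples):          # q is 0-based here
        r = q % p                            # coordinate used for this cut
        idx = next(i for i, b in enumerate(blocks)
                   if all(lo < xi <= hi for xi, (lo, hi) in zip(x, b)))
        block = blocks.pop(idx)
        lo, hi = block[r]
        lower, upper = list(block), list(block)
        lower[r] = (lo, x[r])                # points with coordinate <= cut
        upper[r] = (x[r], hi)                # points with coordinate > cut
        blocks.extend([lower, upper])
    return blocks


def null_probability(block):
    """Probability of a rectangular block under N(0, I)."""
    prob = 1.0
    for lo, hi in block:
        prob *= phi(hi) - phi(lo)
    return prob


def foutz_statistic_rectangular(samples):
    """F_n = sum_i max(0, 1/n - D_i) for the rectangular blocking."""
    probs = [null_probability(b) for b in rectangular_blocks(samples)]
    n = len(probs)
    return sum(max(0.0, 1.0 / n - d) for d in probs)


print(foutz_statistic_rectangular([[0.0, 0.0], [1.0, 1.0]]))
```

With the two sample vectors in the usage line, the first cut is at 0 on the first coordinate and the second at 1 on the second coordinate, giving three blocks whose null probabilities sum to one.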

The probability contents of the statistically equivalent blocks
under the null hypothesis (multivariate normal with zero mean vector
and identity covariance matrix) are easily computable as products
of univariate probabilities (normal, Chi-square and uniform) for the
rectangular coordinate system as well as the polar/spherical
coordinate system.
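
For the polar case with p = 2 the product form can be written out explicitly. Under the null hypothesis the squared radius follows a Chi-square distribution with 2 degrees of freedom, so P(R <= r) = 1 - exp(-r^2/2), and the angle is uniform on [0, 2π) independently of the radius. The sketch below assumes a block bounded by two radii and two angles; the function name is illustrative.

```python
import math


def polar_block_probability(r_lo, r_hi, theta_lo, theta_hi):
    """Null probability of {r_lo < R <= r_hi, theta_lo < theta <= theta_hi}.

    Assumes a bivariate normal with zero mean and identity covariance;
    angles are in radians with 0 <= theta_lo <= theta_hi <= 2*pi.
    """
    radial = math.exp(-0.5 * r_lo ** 2) - math.exp(-0.5 * r_hi ** 2)
    angular = (theta_hi - theta_lo) / (2.0 * math.pi)
    return radial * angular


# Whole plane: radius from 0 to infinity, angle from 0 to 2*pi.
print(polar_block_probability(0.0, math.inf, 0.0, 2.0 * math.pi))
```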

RESULTS

For ease of comprehension, the results for shifts of mean or
variance/covariance are given as power curves in Figures 3-7.
All power curves are based on 5,000 replications at the .05
significance level. The results are indicative of those obtained for
significance levels of .01 and .1. Full details are available in
Linhart [3].

Shifts in mean are detected well. Figures 3 and 4 show that a
shift of one standard deviation in the mean results in about a 60%
rejection rate for both bivariate and trivariate samples. The
rectangular method of blocking consistently resulted in a higher
rejection rate than did the polar/spherical method.

Power curves for shifts in variance are given in Figures 5 and 6.
Small shifts in variance are not detected very well, but larger
shifts and shifts in more than one component resulted in higher
rejection rates. Neither blocking method produced rejection rates
significantly better than the other except in the trivariate case
when one variance was shifted. In general, the polar/spherical
method of blocking gives slightly higher rejection rates, but not
consistently.

The power curves for shifts in covariance are given in Figure 7.
Except for highly correlated data, neither blocking scheme
detects these shifts very well. The polar/spherical method of
blocking usually gives somewhat higher rejection rates than the
rectangular method.

To  test  for  possible  confounding  in  detection  of  shifts  in 
both  mean  and  variance/covariance,  we  made  the  series  of  runs 
listed  in  Tables  II  and  III.  Shifts  are  generally  of  increasing 
amount  as  one  looks  down  and  to  the  right  in  the  tables.  It  is 
also  true  that  rejection  rates  increase  as  one  looks  down  and  to 
the  right,  indicating  there  is  no  detectable  confounding.  The 
rectangular  method  of  blocking  generally  gave  better  results  with 
multiple  shifts. 

All of the above results are based on a sample size of 20 and
significance level of .05. To determine possible trends over sample
size or significance level we made the series of runs given in
Tables IV and V. Larger sample sizes generally led to higher
rejection rates, as is to be anticipated. Violations of this did occur
when rejection rates were close to the significance level, i.e.,
the shift was not detected very well. The rectangular blocking
method generally gave higher rejection rates.

In conclusion, the Foutz test is adept at detecting shifts in
mean, less powerful at detecting shifts in variance, and poor at
detecting correlated variates unless they are highly correlated.
In addition, shifts in mean are not disguised when a shift in
variance/covariance is also present.
TABLE II


REJECTION RATES FOR MULTIPLE SHIFTS IN
MEAN AND VARIANCE-COVARIANCE

α = .05, n = 21*

Sigma    (1  0)     (1   .6)    (2  0)     (2     .849)    (5     1.34)
         (0  1)     (.6   1)    (0  2)     (.849     2)    (1.34     5)

Mean

         .0488      .0912       .1986      .2500           .9?58
         .0482      .1176       .1522      .2162           .9572

         .1710      .2398       .3110      .3702           .9720
         .1388      .2482       .2498      .3402           .9650

         .5606      .7346       .6384      .6828           .9820
         .4348      .5952       .5334      .6316           .9764

         .9418      .8774       .9350      .3658           .9892
         .8576      .??88       .8722      .8308           .9840

         .9998      .9998       .9902      .9950           .9990
         .9950      .9990       .9772      .9882           .9964

* First Entry - Rectangular coordinate blocking
  Second Entry - Polar coordinate blocking


TABLE III


REJECTION RATES FOR MULTIPLE SHIFTS IN
MEAN AND VARIANCE-COVARIANCE

α = .05, n = 21*

Sigma    (1  0  0)    (1   0  .6)    (5  0  0)    (10   0  .95)    (5  0  0)
         (0  1  0)    (0   1   0)    (0  1  0)    (0    1    0)    (0  5  0)
         (0  0  1)    (.6  0   1)    (0  0  1)    (.95  0    1)    (0  0  5)

Mean

         .0440        .0674          .5392        .7828            .9770
         .0518        .0582          .4584        .7840            .9720

         .0480        .1830          .5708        .7946            .9832
         .0280        .1176          .5034        .8020            .9740

         .3728        .6352          .6852        .8206            .9888
         .1174        .2912          .6254        .8422            .9824

         .7400        .9074          .9270        .9668            .9930
         .7392        .74?4          .8602        .9454            .9870

         .9982        .9956          .9716        .9736            .9978
         .9726        .9752          .9742        .9774            .9976

* First Entry - Rectangular coordinate blocking
  Second Entry - Polar coordinate blocking
TABLE IV

REJECTION RATES FOR INCREASING SAMPLE SIZES


Sample size (n-1)            20          30          50

Shift

α = .01

μ = (.5)                   .0574       .0860       .1270
    (0 )                   .0430       .0564       .0754

μ = (.5)                   .1294       .2026       .3652
    (.5)                   .1230       .1418       .2508

Σ = (1   .3)               .0126       .0140       .0176
    (.3   1)               .0136       .0152       .0170

Σ = (1  0)                 .1864       .2722       .4522
    (0  3)                 .1578       .2244       .3744

α = .05

μ = (.5)                   .1710       .2170       .2914
    (0 )                   .1388       .1630       .2238

μ = (.5)                   .3024       .4144
    (.5)                   .3164       .3076       .4826

Σ = (1   .3)               .0576       .0624       .0728
    (.3   1)               .0656       .0624

Σ = (1  0)                 .3786       .4884       .6756
    (0  3)                 .3342       .4304

α = .10

μ = (.5)                   .2700       .3228       .4256
    (0 )                   .2298       .2658       .3400

μ = (.5)                   .4406       .5424       .?190
    (.5)                   .4534       .4336       .6066

Σ = (1   .3)               .1178       .1174       .1396
    (.3   1)               .1258       .1196       .1422

Σ = (1  0)                 .5150       .6132       .7800
    (0  3)                 [illegible] .5566       .7160

First Entry - Rectangular coordinate blocking
Second Entry - Polar coordinate blocking
TABLE V


REJECTION RATES FOR INCREASING SAMPLE SIZES


Sample size (n-1)            20          30          50

α = .01

μ = (.5)                   .0480       .0680       .1036
    (0 )                   .0280       .0362       .0526
    (0 )

μ = (.5)                   .1738       .2932       .5040
    (.5)                   .0848       .1662       .3428
    (.5)

Σ = (1   0  .3)            .0106       .0138       .0148
    (0   1   0)            .0124       .0134       .0144
    (.3  0   1)

Σ = (3  0  0)              .1606       .2054       .3528
    (0  1  0)              .0838       .1138       .2080
    (0  0  1)

α = .05

μ = (.5)                   .1438       .1974       .2742
    (0 )                   .1036       .1256       .1646
    (0 )

μ = (.5)                   .3642       .5118       .7268
    (.5)                   .2198       .3588       .5868
    (.5)

Σ = (1   0  .3)            .0468       .0588       .0656
    (0   1   0)            .0512       .0488       .0540
    (.3  0   1)

Σ = (3  0  0)              .3438       .3976       .5734
    (0  1  0)              .2074